Abstract

We employ the Bayesian framework to define a cointegration measure intended to
represent long term relationships between time series. To visualize these
relationships we introduce a dissimilarity matrix and a map based on the Sorting
Points Into Neighborhoods (SPIN) technique, which has previously been used to
analyze large data sets from DNA arrays. We exemplify the technique on three data
sets: US interest rates, monthly inflation rates and gross domestic product growth
rates.
1 Introduction



Correlations are a central topic in the study of the collective properties of
complex systems, and are of particular practical importance when systems
of economic interest are concerned [1]. Unlike correlation, the idea of
cointegration [2,3] brings in a relationship measure that is long term in nature,
being somewhat related to the concept of damage spreading in a pair of spin
models [4]. However, cointegration has up to now been rather absent from
the description of physical systems and, in particular, from economic systems
studied from a physical perspective. A set of non-stationary time series
cointegrate if there exists a linear combination of them that is mean reverting.
Plainly speaking, two appropriately scaled time series cointegrate if in the
long term they tend to move either together or as mirror images.

Bayesian methods provide a unifying approach to statistics [5]. They help to
establish, from clear first principles, the methods, assumptions and
approximations made in a particular statistical analysis. A major issue in the
study of cointegration is the detection of cointegrated sets, a problem that has
been extensively dealt with in the econometrics literature both from classical [6]
and Bayesian [7] perspectives.

Dealing with extensive volumes of data is a common trend in several areas
of science. The need to sort, cluster, organize, categorize, mine or visualize
large data sets brings a perspective that unifies distant fields, if not in
aims, at least in methods. Cross fertilization may promptly provide candidate
solutions to problems, avoiding the need for rediscovery or, worse, plain
non-discovery. Bioinformatics presents a good example, where the availability
of genome, protein and DNA array data has prompted the proposal of new methods
by several groups. From this repertoire we borrow a method, SPIN [8],
previously developed for automated discovery of cancer associated genes.

Our first goal in this paper is to devise a cointegration measure for time series 
of economic interest that is both physically meaningful and reasonably simple 
to compute. Our second goal is, by employing the SPIN method, to emphasize 
the importance of visual organization and presentation of relationship pictures 
(or maps) that emerge when complex systems are analysed. 

This paper is organized as follows. In the next section we derive a cointe- 
gration measure employing Bayesian statistics and briefly discuss the relation 
between cointegration and correlation and between the proposed measure and 
usual unit-root statistics. In section 3 we use the SPIN method to introduce 
the cointegration heat map as a visualization tool. In section 4 we exemplify 
the proposed technique in three macroeconomic time series: US interest rates 
(USIR), inflation rates (IFR) and gross domestic products (GDP). Conclusions 
are presented in section 5. 



2 Cointegration measure 

A pair of time series $x_1$ and $x_2$ cointegrates [3] if there exists a linear
combination

$$a_1 x_{1,t} + a_2 x_{2,t} + b = \epsilon_t \qquad (1)$$



such that the residues $\epsilon_t$ satisfy the following stationarity condition:

$$\epsilon_{t+1} = \gamma \epsilon_t + \eta_t, \qquad (2)$$

where $\langle \eta_t \rangle = 0$, $\langle \eta_t^2 \rangle = \sigma^2$ and
$\gamma < 1$. If $\gamma = 1$ the residues are non-stationary and if $\gamma > 1$
the system is unstable. Notice that $\gamma$ is related to a time scale
$\tau = 1/(1 - \gamma)$ for the relaxation of $\epsilon_t$ to its long term mean.

We also assume a budget constraint taking the form

$$a_1^2 + a_2^2 = 1. \qquad (3)$$

Since eq. (1) is linear, we can impose this constraint by assuming that
$a_1 = \sin(\theta)$ and $a_2 = \cos(\theta)$ without loss of generality. Note that
the above system still has more freedom arising from the following symmetry group:

$$x_i \to x_i' = \alpha x_i + y_i,$$
$$b \to b' = \alpha b - a_1 y_1 - a_2 y_2,$$
$$\sigma \to \sigma' = \alpha \sigma,$$
$$\gamma \to \gamma' = \gamma, \qquad (4)$$

which means that we can change the units in which quantities are measured
and add constants $y_i$ without interfering with the cointegration property. We
can therefore partially fix the gauge by choosing the $y_i$ such that the
empirical time series averages are zero. This forces a choice of $b = 0$.



That cointegration and correlation in time series fluctuations are quite distinct
properties can be easily seen with the aid of an elementary example. Suppose
two time series $x_t$ and $y_t$ that orbit the same random walk $w_t$ as follows:

$$x_t = w_t + \epsilon^x_t,$$
$$y_t = w_t + \epsilon^y_t,$$
$$w_t = w_{t-1} + \eta_t, \qquad (5)$$

where $\epsilon^x$, $\epsilon^y$ and $\eta$ are random i.i.d. shocks with zero mean
and variances $\sigma_x^2$, $\sigma_y^2$ and $\sigma_\eta^2$, respectively. The
correlation $\rho$ between the first differences $\Delta x = x_t - x_{t-1}$ and
$\Delta y = y_t - y_{t-1}$ can be easily computed yielding:

$$\rho = \frac{\langle \Delta x \Delta y \rangle}{\sigma_{\Delta x} \sigma_{\Delta y}} = \frac{\sigma_\eta^2}{\sigma_{\Delta x} \sigma_{\Delta y}}, \qquad (6)$$

with $\sigma_{\Delta x}^2 = \sigma_\eta^2 + 2\sigma_x^2$ and
$\sigma_{\Delta y}^2 = \sigma_\eta^2 + 2\sigma_y^2$.

Clearly $x$ and $y$ strongly cointegrate ($\gamma = 0$). However, the choice
$\sigma_\eta \ll \sigma_x, \sigma_y$ would imply that the linear correlation in
their fluctuations is at the same time very low.
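The example above can be checked numerically. The following is a minimal sketch, assuming Gaussian shocks and the illustrative values $\sigma_\eta = 0.1$, $\sigma_x = \sigma_y = 1$ (neither choice comes from the text); it compares the empirical correlation of the first differences with eq. (6) and estimates $\gamma$ for the residue $x_t - y_t$ by a simple lag-1 least-squares slope.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the paper).
rng = np.random.default_rng(1)
T = 50000
sigma_eta, sigma_x, sigma_y = 0.1, 1.0, 1.0

# Eq. (5): two series orbiting the same random walk w_t.
w = np.cumsum(rng.normal(0.0, sigma_eta, size=T))
x = w + rng.normal(0.0, sigma_x, size=T)
y = w + rng.normal(0.0, sigma_y, size=T)

# Correlation of the first differences, empirical and from eq. (6).
dx, dy = np.diff(x), np.diff(y)
rho_empirical = float(np.corrcoef(dx, dy)[0, 1])
rho_theory = sigma_eta**2 / np.sqrt(
    (sigma_eta**2 + 2 * sigma_x**2) * (sigma_eta**2 + 2 * sigma_y**2)
)

# The residue x - y contains no random walk component, so the lag-1
# regression slope (a simple stand-in for gamma) should be close to zero.
r = x - y
gamma_hat = float(np.dot(r[:-1], r[1:]) / np.dot(r[:-1], r[:-1]))

print(rho_empirical, rho_theory, gamma_hat)
```

With these values both the correlation and the estimated $\gamma$ should be close to zero, which is the situation described in the text: strong cointegration with almost uncorrelated fluctuations.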

The main ingredients in a Bayesian approach are three. First we need a model,
as given by eqs. (1), (2) and (3). Then a noise model to build the likelihood and
finally the priors. The interesting consequence of an invariance group such as
the one described by eq. (4) is that it, together with the budget and stability
conditions, constrains [5] the form of the priors to:

$$p(\gamma) = \Theta(\gamma)\Theta(1 - \gamma), \qquad (7)$$

where $\Theta(\cdot)$ is the Heaviside step function, and

$$p(\sigma) \propto \frac{1}{\sigma}. \qquad (8)$$

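With these ingredients we can calculate the posterior probability of $\gamma$ given
the residues as:

$$p(\gamma \mid \epsilon) \propto \int_{\sigma_{\min}}^{\infty} d\sigma \, p(\epsilon \mid \gamma, \sigma) \, p(\sigma) \, p(\gamma), \qquad (9)$$

where $\sigma_{\min} > 0$ can be made arbitrarily small without changing the main
results to follow.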

Equations (1) and (2) combined give the following likelihood function for the
residues:

$$p(\epsilon \mid \gamma, \sigma) \propto \prod_{t=1}^{T-1} \frac{1}{\sigma} \exp\left[-\frac{(\epsilon_{t+1} - \gamma \epsilon_t)^2}{2\sigma^2}\right]. \qquad (10)$$



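Performing the integral in eq. (9) yields:

$$p(\gamma \mid \epsilon) \propto \Theta(\gamma)\Theta(1 - \gamma) \left[\sum_{t=1}^{T-1} (\epsilon_{t+1} - \gamma \epsilon_t)^2\right]^{-\frac{T-1}{2}}. \qquad (11)$$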

For large T, the distribution of residues, given the data, can be approximated
by

$$p(\epsilon \mid x_1', x_2') \approx \delta\left(\epsilon_t - (x_{1,t}' \sin\theta + x_{2,t}' \cos\theta)\right) \qquad (12)$$

Fig. 1. Left: Posterior probabilities for two synthetic pairs $x_1, x_2$ of time
series. Cointegrating pair, characterized by the maximum a posteriori estimate
$\hat{\gamma} \approx 0.5$ (top). Non-cointegrating pair, characterized by
$\hat{\gamma} \approx 1$ (bottom). Right: Examples of the cointegration measure in
US interest rate data.

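with $\theta$ estimated by minimizing the variance $\langle \epsilon^2 \rangle$ to find:

$$\theta = -\frac{1}{2} \arctan\left[\frac{2\langle x_1' x_2' \rangle}{\langle x_1'^2 \rangle - \langle x_2'^2 \rangle}\right]. \qquad (13)$$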



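The maximum of the posterior distribution gives an estimate for the relaxation
time as

$$\hat{\gamma} = \arg\max_\gamma \log p(\gamma \mid x_1', x_2') \approx \arg\max_\gamma \log \int d\epsilon \, p(\gamma \mid \epsilon) \, p(\epsilon \mid x_1', x_2'). \qquad (14)$$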



Finally, we define a family of cointegration $\alpha$-measures as

$$d_\alpha(x_1, x_2) = \hat{\gamma}^\alpha. \qquad (15)$$

These measures are symmetric, non-negative and agree with the usual augmented
Dickey-Fuller unit-root tests (ADF) [3] in the sense that lower p-value
t-statistics imply higher degrees of similarity as measured by the cointegration
property (see Table 1).
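The steps leading to eq. (15) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function name `cointegration_measure` and the grid size are hypothetical, the minimum-variance direction is obtained from the smallest-eigenvalue eigenvector of the 2x2 second-moment matrix (which minimizes $\langle \epsilon^2 \rangle$ under the budget constraint, in place of the closed form for $\theta$), and the maximum of the posterior of eq. (11) is located on a grid over $[0, 1]$.

```python
import numpy as np

def cointegration_measure(x1, x2, alpha=1.0, n_grid=1001):
    """Sketch of d_alpha(x1, x2) = gamma_hat**alpha for two equal-length series."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Gauge fixing: subtract the empirical averages, which sets b = 0.
    x1 = x1 - x1.mean()
    x2 = x2 - x2.mean()
    # Minimum-variance direction under a1^2 + a2^2 = 1: eigenvector of the
    # 2x2 second-moment matrix with the smallest eigenvalue.
    m = np.array([[x1 @ x1, x1 @ x2], [x1 @ x2, x2 @ x2]]) / x1.size
    _, vecs = np.linalg.eigh(m)  # eigenvalues are returned in ascending order
    a1, a2 = vecs[:, 0]
    e = a1 * x1 + a2 * x2  # residues of eq. (1)
    # S(gamma) = sum_t (e[t+1] - gamma * e[t])^2, expanded as a quadratic.
    a = e[1:] @ e[1:]
    b = e[:-1] @ e[1:]
    c = e[:-1] @ e[:-1]
    gammas = np.linspace(0.0, 1.0, n_grid)  # support of the prior, eq. (7)
    s = np.maximum(a - 2.0 * gammas * b + gammas**2 * c, np.finfo(float).tiny)
    log_post = -0.5 * (e.size - 1) * np.log(s)  # log of eq. (11) up to a constant
    gamma_hat = gammas[np.argmax(log_post)]
    return float(gamma_hat**alpha)

# Synthetic check: a pair orbiting a common random walk versus two
# independent random walks (Gaussian shocks are an assumption).
rng = np.random.default_rng(0)
T = 2000
w = np.cumsum(rng.normal(size=T))
x = w + rng.normal(size=T)
y = w + rng.normal(size=T)
u = np.cumsum(rng.normal(size=T))
v = np.cumsum(rng.normal(size=T))
d_xy = cointegration_measure(x, y)
d_uv = cointegration_measure(u, v)
print(d_xy, d_uv)
```

The cointegrating pair should give a value near 0 and the independent random walks a value near 1, so smaller values of the measure indicate stronger cointegration.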

The value of $\alpha$ controls the quality of the visualizations generated and has
been chosen to be $\alpha = 1$ (IFR, GDP) and $\alpha = 2$ (USIR) in the datasets we
have analyzed. In Fig. 1 (left) we show the log-posteriors obtained for synthetic
time series generated with $T = 1000$ and $\gamma = 0.5$ and $\gamma = 1.0$. In
Fig. 1 (right) we illustrate the cointegration measure with time series from the
USIR dataset. Notice that it can be easily verified by a Taylor expansion of the
logarithm of the posterior density (eq. (11)) around its maximum that the error
bar for the estimate $\hat{\gamma}$ is proportional to $T^{-1/2}$.

Pair        $\hat{\gamma}$   ADF t-stat
SLB-FP3     0.99             -2.21
CP6-TC1Y    0.95             -4.70
CP1-CP6     0.92             -6.92
CP1-CP3     0.87             -8.72
FED-CP6     0.77             -6.97
FED-CP3     0.64             -8.47
FED-CP1     0.50             -10.36

Table 1
ADF test and cointegration measure: Using the USIR dataset as an example, it can
be seen that the estimated measures are consistent with the ADF tests in the sense
that more improbable t-statistics imply stronger cointegration. The critical value
at 1% is t = -3.88 in this case.
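The $T^{-1/2}$ scaling of the error bar can be made explicit. Writing the sum in eq. (11) as $S(\gamma)$, the log-posterior is $-\frac{T-1}{2}\log S(\gamma)$, and its second derivative at the maximum gives a Gaussian width $\sigma_\gamma^2 = S_{\min} / [(T-1)\sum_t \epsilon_t^2]$. The sketch below, assuming Gaussian shocks and ignoring the step-function prior (the maximum is taken to lie inside $[0,1]$), evaluates this width for two synthetic residue series whose lengths differ by a factor of four; the function names are illustrative.

```python
import numpy as np

def simulate_residues(gamma, T, rng):
    """Residues following eq. (2) with unit-variance Gaussian shocks (assumed)."""
    eta = rng.normal(size=T)
    e = np.zeros(T)
    for t in range(T - 1):
        e[t + 1] = gamma * e[t] + eta[t]
    return e

def gamma_error_bar(e):
    """Maximum of eq. (11) and the width from a second-order Taylor expansion."""
    a = e[1:] @ e[1:]
    b = e[:-1] @ e[1:]
    c = e[:-1] @ e[:-1]
    gamma_hat = b / c          # minimum of S(gamma) = a - 2*gamma*b + gamma^2*c
    s_min = a - b * b / c      # value of S at its minimum
    # d^2/dgamma^2 log p = -(T - 1) * c / s_min at the maximum.
    err = np.sqrt(s_min / ((e.size - 1) * c))
    return float(gamma_hat), float(err)

rng = np.random.default_rng(2)
g_short, err_short = gamma_error_bar(simulate_residues(0.5, 2000, rng))
g_long, err_long = gamma_error_bar(simulate_residues(0.5, 8000, rng))
ratio = err_short / err_long
print(g_short, err_short, g_long, err_long, ratio)
```

Quadrupling the length of the series should roughly halve the error bar, so the printed ratio is expected to be close to 2.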



3 Cointegration heat map 



Given a set of time series we are interested in discovering low dimensional
structures embedded in an appropriate dissimilarity matrix D. The way we
define this dissimilarity matrix is conditioned by the use we intend for the
data. In principle, we can define $D_{jk} = d_\alpha(x_j, x_k)$, meaning that
shorter relaxation times $\tau$ imply more similarity between two time series.
Alternatively, inspired by the expression matrices employed in bioinformatics [8],
we can define vectors $\mathbf{d}_j = (d_{1j}, \ldots, d_{Nj})$ representing the
cointegration profile between time series j and each one of the N series composing
the system, with an arbitrary but fixed ordering. A dissimilarity matrix can then
be defined along these lines as:

$$D_{jk} = \sqrt{\sum_{l=1}^{N} (d_{lj} - d_{lk})^2}. \qquad (16)$$

In this case two time series are similar if they interact with each system com- 
ponent alike. We have observed that the latter choice yields the same basic 
structures with clearer and smoother visualization. The use of primitive mea- 
sures to build second order dissimilarity matrices is a basic idea behind the 
spectral clustering techniques [9]. However, to our knowledge, the particular
construction described by eq. (16) has not appeared in the literature to date.
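The second order construction described above amounts to taking Euclidean distances between the columns of the matrix of pairwise measures. A minimal sketch follows; the function name `profile_dissimilarity` and the small 3x3 matrix of pairwise measures are made up for illustration.

```python
import numpy as np

def profile_dissimilarity(d):
    """Euclidean distance between cointegration profiles (columns of d)."""
    d = np.asarray(d, dtype=float)
    # diff[l, j, k] = d[l, j] - d[l, k]
    diff = d[:, :, None] - d[:, None, :]
    return np.sqrt(np.sum(diff**2, axis=0))

# Hypothetical pairwise measures d[l, j] for N = 3 series.
d = np.array([
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
])
D = profile_dissimilarity(d)
print(D)
```

In this toy case series 0 and 1 have similar profiles with respect to all three series, so their second order dissimilarity is smaller than that of either with series 2.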

Fig. 2. Left: Correlation heat map for the USIR dataset. Rectangles represent the
classification yielded by the SPC technique. The general pattern is comparable with
figure 3b of [20]. Right: Cointegration heat map for the USIR dataset with SPC
classification again represented by rectangles. The general emergent pattern clusters
short term Treasury instruments around n4, financial companies related instruments
around n3 and long term instruments around n1.

There are several possible aims behind unsupervised segmentation based on
a dissimilarity matrix D. Categorization from clustering algorithms has been
used for market segmentation [10,11]. For example, the Superparamagnetic
Clustering (SPC) algorithm [12,13] has been particularly useful since the number
of clusters is not known a priori and the scale of resolution of the categories
can be tuned by a temperature like parameter. Sometimes the data might not
have a clear discrete class structure, and here the SPIN algorithm makes a
difference through its ability to help identify low dimensional structures in
a high dimensional space. Without knowing in advance what type of segmentation
will emerge, the clustering and SPIN algorithms should be thought of as
complementary.

The aim of SPINing a dissimilarity matrix is to obtain a permutation such that
points close in distance are brought by the permutation to places in the matrix
that are also close. Since the space of permutations is factorially large, this
can easily be seen to be a potentially hard problem. The permutations are chosen
sequentially, for example to minimize a cost function that penalizes large
distances and puts them far from the diagonal, or alternatively to bring pairs
with small distances near to the diagonal. Unless the structure can be ordered
in one dimension, these requirements can lead to frustration. The class of cost
functions proposed in [8] is of the form $\mathcal{F}(P) = \mathrm{Tr}(P D P^T W)$,
with P being a permutation of matrix indices and W a weight matrix which defines
the algorithm. For their choices, namely Side-to-Side (STS), defined as
$W = X X^T$ with $X_i > X_j$ if $i > j$, and Neighborhood, defined as
$W_{ij} = \exp(-|i - j|/\sigma)$, the minimization was shown to be NP-complete.
The way out is to be satisfied with non-optimal solutions that can be obtained
in fast times (between $O(n^2)$ and $O(n^3)$) and that turn out to be just as
informative.

Fig. 3. Left: SPINed cointegration map for the IFR dataset. The clusters that emerge
correspond to economies with highly volatile prices (n1), developed economies with
stable prices (n2), economies with events of hyperinflation in the period observed
(n3) and economies with highly stable prices (n4). Right: Examples in each group.
Group n1: Benin (1a), Central African Republic (1b) and Syria (1c). Group n2: USA
(2a), Sweden (2b) and Japan (2c). Group n3: Brazil (3a), Russia (3b) and Indonesia
(3c). Group n4: Australia (4a), New Zealand (4b) and Tuvalu (4c).

The problem of sorting into categories is ill posed and therefore there will not
be something like 'the answer'. The reduction to an optimization problem, using
either STS or Neighborhood, leads to an NP-complete problem. It is fair to expect
that any reasonable weight function will share that characteristic. We have
therefore found it adequate to experiment with the algorithms and apply them to
different subsets: try optimizing the whole matrix, then choose a relevant
cointegrating subset, optimize the subset, go back up and optimize the whole set,
and intercalate different algorithms. The result will tend to be better as
measured by the cost function. This heuristic helps escape from local minima,
although it does not cure the fundamental problem that there might be frustration
in a general sorting problem. This is not really a problem: good, albeit not
optimal, solutions are just as informative as a perfect solution would be for all
practical purposes. In the following analysis we have found that the best
visualizations were simply achieved by employing the STS solution as an initial
condition to the Neighborhood variant, iterated with a schedule for reducing the
scale parameter $\sigma$ in a simulated annealing fashion.
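The cost function and a Side-to-Side style iteration can be sketched as follows. This is a simplified illustration and not the reference SPIN implementation of [8]: the centered linear choice of X, the exponential form assumed for the Neighborhood weights, the sort-and-repeat update, and all function names are assumptions made here. The loop keeps the best permutation seen, since the simple update is not guaranteed to lower the cost at every step.

```python
import numpy as np

def sts_weights(n):
    """W = X X^T with X increasing in the index (centered linear X, assumed)."""
    x = np.arange(n) - (n - 1) / 2.0
    return np.outer(x, x)

def neighborhood_weights(n, sigma):
    """Weights decaying away from the diagonal with scale sigma (assumed form)."""
    i = np.arange(n)
    return np.exp(-np.abs(i[:, None] - i[None, :]) / sigma)

def cost(D, perm, W):
    """F(P) = Tr(P D P^T W), with the permutation given as an index array."""
    Dp = D[np.ix_(perm, perm)]
    return float(np.trace(Dp @ W))

def side_to_side(D, max_iter=100):
    """Repeatedly sort the rows by S = D X (descending) until the order is stable."""
    n = D.shape[0]
    x = np.arange(n) - (n - 1) / 2.0
    W = np.outer(x, x)
    perm = np.arange(n)
    best_perm, best_cost = perm.copy(), cost(D, perm, W)
    for _ in range(max_iter):
        s = D[np.ix_(perm, perm)] @ x
        order = np.argsort(-s, kind="stable")
        if np.array_equal(order, np.arange(n)):
            break  # fixed point: the current ordering is already sorted
        perm = perm[order]
        c = cost(D, perm, W)
        if c < best_cost:
            best_perm, best_cost = perm.copy(), c
    return best_perm, best_cost

# Toy example: points on a line presented in shuffled order.
points = np.array([3.0, 0.0, 2.0, 1.0, 4.0])
D = np.abs(points[:, None] - points[None, :])
perm, c = side_to_side(D)
print(perm, points[perm], c)
```

For one dimensional data such as this toy example there is no frustration and the iteration recovers the ordering along the line. The Neighborhood weights are provided only to evaluate the alternative cost; the annealing schedule for $\sigma$ mentioned in the text is not implemented here.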



4 Application examples 

We exemplify the method by calculating cointegration maps for three data
sets: (USIR) weekly US interest rates for 34 instruments from January 8, 1982
to August 29, 1997 (T = 817 weeks) [14]; (IFR) monthly inflation rates for
179 countries from August 1993 to December 2004 (T = 137 months) [15];
(GDP) yearly gross domestic product growth rates for 71 countries from 1980
to 2004 (T = 25 years) [16].

Measurement in soft sciences is itself a challenging activity [17]. Socio-economic
systems are self-aware; there are severe limits to the accuracy of statistical data
that can be gathered and even the definition of several macroeconomic quantities
is still debatable [18,19]. An exception to these data quality constraints
are the organized financial markets like those of interest rate instruments in
dataset USIR.

Fig. 4. Left: SPINed cointegration map for the GDP dataset. The clusters that
emerge correspond to countries with accelerating or decelerating economies (Group
n1); developed countries with stable and accelerating economies (Group n2); volatile
economies including major oil producers (Group n3) and stable economies (Group
n4). Right: Examples in each group. Group n1: Bangladesh (1a), Tanzania (1b)
and Pakistan (1c). Group n2: Norway (2a), United Kingdom (2b) and Japan (2c).
Group n3: Kuwait (3a), Venezuela (3b) and Saudi Arabia (3c). Group n4: US (4a),
Australia (4b) and Italy (4c).

In Figure 2 (left) we show the SPINed heat map for correlation coefficients of
time series fluctuations. Pseudocolors are assigned according to dissimilarities
calculated with eq. (16) by replacing the cointegration measure by correlation
coefficients. In the same figure we show, as rectangles identified by n1, ..., n6,
the hierarchical grouping structure generated by the SPC technique (K = 7, see
[12]). A characteristic of unsupervised classification techniques is the reliance
of the results on the dissimilarity measure adopted. Despite the differences, the
general patterns revealed in Figure 2 compare well with those of figure 3b of
[20], which employs a classical agglomerative clustering with a metric distance
based on linear correlation coefficients. Notice that the SPINed heat map is
capable of showing nuances in the relationship structure that are absent in
the traditional or SPC approaches. For example, the Treasury bill rates with
maturities of 3 and 6 months (TBA3M and TBA6M) and other instruments
of the same maturity, in particular Treasury securities at constant maturity
(TC3M and TC6M), correlate alike. However, this sort of information is lost
both in the SPC classification and in [20], with the former classifying these
instruments according to their maturities and the latter grouping TBAs
in one group and TCs in another.

Figure 2 (right) shows the cointegration map for USIR. Considering the
reliability of the estimates ($\sigma_\gamma \approx 0.035$ for USIR), the map
produced allows a direct visualization of relationships through the whole set of
time series without imposing any ad hoc classification criteria. The
classification provided by SPC (K = 7) is also shown as rectangles identified by
n1, ..., n6. As we have already discussed, correlation of the fluctuations and
cointegration are different relationship measures. Two time series will pertain
to the same cointegration group if they tend to orbit a common time series. The
general pattern groups short term Treasury bills in n4, long term instruments,
both treasury and corporate, around n1 and finance company related instruments
(interbank eurodollar (EDs), certificates of deposit (CDs) and finance company
papers (FPs)) around n3.

Figure 3 (left) shows the SPINed cointegration map for monthly inflation data
(IFR). Even though the estimates are less reliable in this case
($\sigma_\gamma \approx 0.085$), it is possible to identify groups by inspecting
their mutual relationships represented by the color map. Figure 3 (right) allows
a direct interpretation of the segmentation provided. Group n1 consists of
countries that exhibit volatile inflation profiles with both high inflation and
high deflation periods (Benin (1a), Central African Republic (1b) and Syria
(1c)). Group n2 is composed mainly of advanced economies with stable inflation
patterns (USA (2a), Sweden (2b) and Japan (2c)) and countries that are very
closely related to them (e.g. Martinique, Singapore and Bahamas). Group n3
consists of countries that experienced hyperinflation in the period observed
(Brazil (3a), Russia (3b) and Indonesia (3c)). Group n4 contains countries with
very stable and low inflation profiles, among them New Zealand (4b), which
adopted inflation targeting as early as 1988, Australia (4a), which adopted
inflation targeting in 1993, and Tuvalu (4c), which uses the Australian dollar as
currency. However, apart from Australia and New Zealand, all the other countries
that adopted inflation targeting before the period observed (1993-2004)
(Canada, Finland, Korea, Sweden and United Kingdom) have been classified
in Group n2.

The cointegration map for GDP data (Fig. 4 (left)) must be treated with care
as this data set is smaller (T = 25) and, therefore, statistically less reliable
than the previous two sets ($\sigma_\gamma \approx 0.2$). To minimize
interpretation problems due to GDP measurement issues we have selected from the
IMF database 71 countries that have had market economies in the period observed
(1980-2004). As a criterion to classify different groups, we have looked at
general interaction patterns compatible with the limited reliability of the
estimates. The SPINed matrix shows that there are four distinguishable classes,
but that their boundaries are not sharp. This illustrates again the difference
between SPIN and traditional clustering techniques. For the latter either sharp
boundaries (e.g. for hierarchical and K-means techniques) or some sort of
low dimensional structure (e.g. fuzzy clustering) must be imposed even when
there are none [21]. We have therefore defined Group n1 as being composed
of countries that interact with countries in Group n2. Group n2 consists of
countries that interact with Group n4 and less strongly with Group n3. Group
n3 is characterized by countries that do not interact with Group n1, interact
strongly with Group n4 and less strongly with Group n2. Finally, Group n4
interacts with Groups n2 and n3 but not with Group n1. This procedure results
in accelerating (Fig. 4 (right), Bangladesh (1a) and Tanzania (1b)) or
decelerating economies (Pakistan (1c)) in Group n1; low volatility economies both
stable (Norway (2a) and United Kingdom (2b)) and decelerating (Japan (2c))
in Group n2; highly volatile unstable economies, including developing countries
and all major oil producers (Kuwait (3a), Venezuela (3b) and Saudi Arabia (3c)),
in Group n3; and stable economies in Group n4 (US (4a), Australia (4b) and
Italy (4c)). Notice that the difference between Groups n4 and n2 is their
relation with Group n1, that is, the presence of some countries with
accelerating or decelerating growth rates in n2.



5 Conclusion 

In this paper we have developed a simple measure for long term pairwise
relationships in sets of time series by introducing a Bayesian estimate for
a cointegration distance. To visualize the relationships with a minimum of
ad hoc structure, we have borrowed from the repertoire of bioinformatics the
SPIN ordering technique to produce cointegration heat maps. We have exemplified
the technique on three sets of time series of financial and economic interest
and have been able to identify low-dimensional structures that make economic
sense emerging from the procedure.

Our aim in this work has been the development of tools that may be useful
for discovering collective long term structures in economic time series. We
think that a thorough understanding of the economic phenomena behind the
observed patterns depends on our ability to describe the system interactions
in some detail, which is beyond the scope of the present work. Considering
that socio-economic systems belong to a class of complex systems with
unreliably known interactions and dynamics, we regard pattern recognition tasks
such as those described here as a first important step towards a deeper
quantitative understanding of such systems.